Montague Grammar Induction

Authors: Kim, Gene Louis; White, Aaron Steven
Publication venue: 'Linguistic Society of America'
Publication date: 02/03/2021

We propose a computational model for inducing full-fledged combinatory categorial grammars from behavioral data. This model contrasts with prior computational models of selection in representing syntactic and semantic types as structured (rather than atomic) objects, enabling direct interpretation of the modeling results relative to standard formal frameworks. We investigate the grammar our model induces when fit to a lexicon-scale acceptability judgment dataset – MegaAcceptability – focusing in particular on the types our model assigns to clausal complements and the predicates that select them.

Proceedings Published by the LSA (Linguistic Society of America)

Information and Incrementality in Syntactic Bootstrapping

Author: White, Aaron Steven
Publication date: 01/01/2015

Some words are harder to learn than others. For instance, action verbs like "run" and "hit" are learned earlier than propositional attitude verbs like "think" and "want." One reason "think" and "want" might be learned later is that, whereas we can see and hear running and hitting, we can't see or hear thinking and wanting. Children nevertheless learn these verbs, so a route other than the senses must exist. There is mounting evidence that this route involves, in large part, inferences based on the distribution of syntactic contexts a propositional attitude verb occurs in---a process known as "syntactic bootstrapping." This fact makes the domain of propositional attitude verbs a prime proving ground for models of syntactic bootstrapping. With this in mind, this dissertation has two goals: on the one hand, it aims to construct a computational model of syntactic bootstrapping; on the other, it aims to use this model to investigate the limits on the amount of information about propositional attitude verb meanings that can be gleaned from syntactic distributions. I show throughout the dissertation that these goals are mutually supportive. In Chapter 1, I set out the main problems that drive the investigation. In Chapters 2 and 3, I use both psycholinguistic experiments and computational modeling to establish that there is a significant amount of semantic information carried in both participants' syntactic acceptability judgments and syntactic distributions in corpora. To investigate the nature of this relationship I develop two computational models: (i) a nonnegative model of (semantic-to-syntactic) projection and (ii) a nonnegative model of syntactic bootstrapping. In Chapter 4, I use a novel variant of the Human Simulation Paradigm to show that the information carried in syntactic distribution is actually utilized by (simulated) learners. 
In Chapter 5, I present a proposal for how to solve a standing problem in how syntactic bootstrapping accounts for certain kinds of cross-linguistic variation. And in Chapter 6, I conclude with future directions for this work.

Digital Repository at the University of Maryland

Intensional Gaps: Relating veridicality, factivity, doxasticity, bouleticity, and neg-raising

Authors: Gantt, Will; Kane, Benjamin; White, Aaron Steven
Publication venue: 'Linguistic Society of America'
Publication date: 05/01/2022

We investigate which patterns of lexically triggered doxastic, bouletic, neg(ation)-raising, and veridicality inferences are (un)attested across clause-embedding verbs in English. To carry out this investigation, we use a multiview mixed effects mixture model to discover the inference patterns captured in three lexicon-scale inference judgment datasets: two existing datasets, MegaVeridicality and MegaNegRaising, which capture veridicality and neg-raising inferences across a wide swath of the English clause-embedding lexicon, and a new dataset, MegaIntensionality, which similarly captures doxastic and bouletic inferences. We focus in particular on inference patterns that are correlated with morphosyntactic distribution, as determined by how well those patterns predict the acceptability judgments in the MegaAcceptability dataset. We find that there are 15 such patterns attested. Similarities among these patterns suggest the possibility of underlying lexical semantic components that give rise to them. We use principal component analysis to discover these components and suggest generalizations that can be derived from them.

Proceedings Published by the LSA (Linguistic Society of America)

On double access, cessation and parentheticality

Authors: Altshuler, Daniel; Hacquard, Valentine; Roberts, Thomas; White, Aaron Steven
Publication date: 01/01/2015

Arguably the biggest challenge in analyzing English tense is to account for the double access interpretation, which arises when a present tensed verb is embedded under a past attitude, e.g., "John said that Mary is pregnant". Present-under-past does not always result in a felicitous utterance, however; cf. "John believed that Mary is pregnant". While such oddity has been noted, the contrast has never been explained. In fact, English grammars and manuals generally prohibit present-under-past. Work on double access, on the other hand, has either disregarded the oddity (e.g., Abusch 1997: 39) or treated it as a reflex of a particular dialect (e.g., Kratzer 1998: 14). The goal of the paper is to argue, based on a corpus study, that a present-under-past sentence is grammatical, but modulated by two interacting pragmatic phenomena: cessation and parentheticality.

PhilPapers

Proceedings Published by the LSA (Linguistic Society of America)

Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation

Authors: Haldar, Aparajita; Hu, J. Edward; Pavlick, Ellie; Poliak, Adam; Rudinger, Rachel; Van Durme, Benjamin; White, Aaron Steven
Publication date: 01/01/2018

We present a large-scale collection of diverse natural language inference (NLI) datasets that help provide insight into how well a sentence representation captures distinct types of reasoning. The collection results from recasting 13 existing datasets from 7 semantic phenomena into a common NLI structure, resulting in over half a million labeled context-hypothesis pairs in total. We refer to our collection as the DNC: Diverse Natural Language Inference Collection. The DNC is available online at https://www.decomp.net, and will grow over time as additional resources are recast and added from novel sources. Comment: To be presented at EMNLP 2018. 15 pages.

arXiv.org e-Print Archive

Crossref

Scholarship, Research, and Creative Work at Bryn Mawr College | Bryn Mawr College Research

A computational model of S-selection

Authors: Rawlins, Kyle; White, Aaron Steven
Publication venue: 'Linguistic Society of America'
Publication date: 15/10/2016

We develop a probabilistic model of S(emantic)-selection that encodes both the notion of systematic mappings from semantic type signature to syntactic distribution – i.e., projection rules – and the notion of selectional noise – e.g., C(ategory)-selection, L(exical)-selection, and/or other independent syntactic processes. We train this model on data from a large-scale judgment study assessing the acceptability of 1,000 English clause-taking verbs in 50 distinct syntactic frames, finding that this model infers coherent semantic type signatures. We focus in on type signatures relevant to interrogative and declarative selection, arguing that our results suggest a principled split between cognitive verbs, which select distinct proposition and question types, and communicative verbs, which select a single hybrid type.

Proceedings Published by the LSA (Linguistic Society of America)